perf(pipeline): Improve execution times for dense pipeline graphs #4824

dzhengg · 2025-01-16T00:23:02Z

There are several performance improvements here:

Memoize the anyUpstreamStagesFailed extension function to improve time complexity from exponential to linear
Optimize getAncestorsImpl to reduce time complexity by a factor of N, where N is the number of stages in a pipeline
Optimize StartStageHandler to only call withAuth (which calls getAncestorsImpl) when needed

Overall, these improvements reduce pipeline execution times across the board, with the biggest gains seen from very dense pipeline graphs such as the contrived example in ComplexPipeline.kt. After the optimizations, the timing of the test added to StartStageHandler improves from never completing to finishing in ~160ms on my machine.

christosarvanitis · 2025-01-16T07:26:52Z

orca-core/src/main/java/com/netflix/spinnaker/orca/pipeline/model/StageExecutionInternals.java

              .collect(toList());
      List<StageExecution> syntheticStages =
          stage.getExecution().getStages().stream()
+              .filter(s -> s.getSyntheticStageOwner() != null)


Was it really going through all the stages before and not only the STAGE_BEFORE/AFTER?

That is a good catch 🚀 👏

This turns the anyUpstreamStagesFailed calculation from one that scales exponentially based on the number of (branches+downstream stages) in a pipeline to one that scales linearly based on the total number of stages in a pipeline. This is a significant performance improvement, especially for very large and complicated pipelines
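
A minimal sketch of the memoization idea, not the code from this PR: `SketchStage`, `UpstreamFailureCheck`, `evaluations`, and `denseGraph` are hypothetical names used only for illustration, and the real function is a Kotlin extension on `StageExecution`. It caches each stage's result by ref id so that a stage reachable through many paths is evaluated once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stage; not Orca's StageExecution. */
final class SketchStage {
  final String refId;
  final boolean failed;
  final List<SketchStage> parents;

  SketchStage(String refId, boolean failed, List<SketchStage> parents) {
    this.refId = refId;
    this.failed = failed;
    this.parents = parents;
  }
}

final class UpstreamFailureCheck {
  // A plain HashMap is enough because the recursion runs on one thread.
  private final Map<String, Boolean> memo = new HashMap<>();

  // Counts stages whose result had to be computed (illustration only).
  int evaluations = 0;

  boolean anyUpstreamStagesFailed(SketchStage stage) {
    Boolean cached = memo.get(stage.refId);
    if (cached != null) {
      return cached; // already computed while walking another path
    }
    evaluations++;
    boolean result = false;
    for (SketchStage parent : stage.parents) {
      if (parent.failed || anyUpstreamStagesFailed(parent)) {
        result = true;
        break;
      }
    }
    memo.put(stage.refId, result);
    return result;
  }

  /** Builds layers of `width` stages, each depending on every stage in the previous layer. */
  static SketchStage denseGraph(int layers, int width) {
    List<SketchStage> previous = List.of();
    for (int layer = 0; layer < layers; layer++) {
      List<SketchStage> current = new ArrayList<>();
      for (int i = 0; i < width; i++) {
        current.add(new SketchStage(layer + "-" + i, false, previous));
      }
      previous = current;
    }
    return new SketchStage("final", false, previous);
  }
}
```

For `denseGraph(20, 5)` with no failed stages, the memoized walk evaluates 101 stages (100 layered stages plus the final one), whereas an unmemoized walk would follow on the order of 5^20 distinct paths.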

before checking for parent stages. stage.getRequisiteStageRefIds is a more expensive call because the underlying implementation creates a copy of a List. Therefore, start with the cheaper operation first hoping to short-circuit and avoid the more expensive check
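
The ordering argument can be illustrated with a small sketch. The commit text does not say which cheap check runs first, so `cheapFlag` below is a placeholder, and `OrderedChecks`, `Node`, and `copies` are hypothetical names. The point is only that `&&` skips the copying getter whenever the cheap condition is already false.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class OrderedChecks {
  /** Hypothetical stage; cheapFlag stands in for whatever inexpensive check the real code uses. */
  static final class Node {
    final boolean cheapFlag;
    private final List<String> requisiteStageRefIds;
    int copies = 0; // how many defensive copies were made (illustration only)

    Node(boolean cheapFlag, List<String> requisiteStageRefIds) {
      this.cheapFlag = cheapFlag;
      this.requisiteStageRefIds = requisiteStageRefIds;
    }

    // Mimics a getter that returns a fresh copy on every call.
    Collection<String> getRequisiteStageRefIds() {
      copies++;
      return new ArrayList<>(requisiteStageRefIds);
    }
  }

  // Cheap field read first; the copying getter runs only if the flag is true.
  static boolean matches(Node node, String refId) {
    return node.cheapFlag && node.getRequisiteStageRefIds().contains(refId);
  }
}
```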

The javadocs state that the syntheticStageOwner property is null for non-synthetic stages. Use this information to filter out non-synthetic stages before attempting a potentially expensive operation to check for synthetic parents of previousStages
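
A rough sketch of that filter, under stated assumptions: `SyntheticFilterSketch`, `Item`, `Owner`, `hasParentIn`, and `expensiveChecks` are hypothetical stand-ins, and the parent lookup here is far simpler than the real one. Only the null check on the synthetic stage owner corresponds to the line added in the diff.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class SyntheticFilterSketch {
  /** Stand-in for the owner values mentioned in the review (STAGE_BEFORE/AFTER). */
  enum Owner { STAGE_BEFORE, STAGE_AFTER }

  /** Hypothetical stage; syntheticStageOwner is null for non-synthetic stages. */
  static final class Item {
    final String id;
    final Owner syntheticStageOwner;
    final String parentStageId;

    Item(String id, Owner syntheticStageOwner, String parentStageId) {
      this.id = id;
      this.syntheticStageOwner = syntheticStageOwner;
      this.parentStageId = parentStageId;
    }
  }

  // Counts how often the (assumed expensive) parent check runs.
  static int expensiveChecks = 0;

  static boolean hasParentIn(Item item, Set<String> previousStageIds) {
    expensiveChecks++;
    return item.parentStageId != null && previousStageIds.contains(item.parentStageId);
  }

  static List<Item> syntheticChildren(List<Item> all, Set<String> previousStageIds) {
    return all.stream()
        .filter(s -> s.syntheticStageOwner != null) // cheap: drops non-synthetic stages
        .filter(s -> hasParentIn(s, previousStageIds)) // runs only for synthetic stages
        .collect(Collectors.toList());
  }
}
```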

dbyron-sf approved these changes Jan 16, 2025

christosarvanitis reviewed Jan 16, 2025

Daniel Zheng added 11 commits January 16, 2025 08:48

test(pipeline): Define StartStageHandler performance

87fdd27

when given a complex pipeline with multiple layers of upstream stages

refactor(stage): Move anyUpstreamStagesFailed to StartStageHandler

0cf61bb

since that's the only place where it gets used

fix(stage): Avoid using a ConcurrentHashMap

188ad02

for memoization. The recursive anyUpstreamStagesFailed(StageExecution) function runs in a single thread, so ConcurrentHashMap is not necessary here

docs(test): Use a more concise test name

dd92e9c

perf(stage): Precompute requisiteStageRefIds

388dc75

StageExecutionImpl.getRequisiteStageRefIds() returns a copy of a Set. This is a costly operation that has the potential to get repeated for every unvisited stage. To avoid this, compute the value before entering a loop
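
As an illustration of hoisting the call out of the loop (a sketch only; `HoistedLookup`, `Node`, `parentsOf`, and `copies` are hypothetical and do not mirror the actual `getAncestorsImpl` code), the copy is taken once and reused for every candidate:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class HoistedLookup {
  /** Hypothetical stage whose getter copies its set on every call. */
  static final class Node {
    private final Set<String> requisiteStageRefIds;
    int copies = 0; // how many copies were made (illustration only)

    Node(Set<String> requisiteStageRefIds) {
      this.requisiteStageRefIds = requisiteStageRefIds;
    }

    Set<String> getRequisiteStageRefIds() {
      copies++;
      return new HashSet<>(requisiteStageRefIds);
    }
  }

  static List<String> parentsOf(Node stage, List<String> candidateRefIds) {
    // One copy before the loop instead of one copy per candidate.
    Set<String> requisites = stage.getRequisiteStageRefIds();
    List<String> parents = new ArrayList<>();
    for (String refId : candidateRefIds) {
      if (requisites.contains(refId)) {
        parents.add(refId);
      }
    }
    return parents;
  }
}
```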

perf(stage): Only use withAuth when needed

5a6775e

withAuth is only necessary when starting a stage. Since withAuth is very computationally expensive for complex pipelines, only call it when it is absolutely necessary.

perf(stage): Remove duplicate call to withStage

4ac447f

StartStageHandler already makes a call to message.withStage at the beginning of the handle() method. Therefore, this call within the catch block is unnecessary

chore(import): Clean up unused imports

72fb2bf

dzhengg force-pushed the dense-pipeline-perf branch from 7dbef46 to 72fb2bf January 16, 2025 16:51

christosarvanitis approved these changes Jan 16, 2025

dbyron-sf added the ready to merge label Jan 17, 2025

mergify bot added the auto merged label Jan 17, 2025

mergify bot merged commit 93f27cc into spinnaker:master Jan 17, 2025
4 checks passed

dzhengg deleted the dense-pipeline-perf branch January 17, 2025 17:53

spinnakerbot added the target-release/1.37 label Jan 17, 2025
